The Linux uniq command detects and removes repeated lines in a text file. It only compares adjacent lines, so it is usually combined with the sort command.
    $ cat testfile      # original contents
    test 30
    test 30
    test 30
    Hello 95
    Hello 95
    Hello 95
    Hello 95
    Linux 85
    Linux 85
    $ uniq testfile     # contents after removing duplicate lines
    test 30
    Hello 95
    Linux 85
When the duplicate lines are not adjacent, uniq has no effect. For example, it leaves a file with the following contents unchanged:
    $ cat testfile1     # original contents
    test 30
    Hello 95
    Linux 85
    test 30
    Hello 95
    Linux 85
    test 30
    Hello 95
    Linux 85
    # in this case, sort the file first:
    $ sort testfile1 | uniq
    Hello 95
    Linux 85
    test 30
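To see the adjacency rule in isolation, here is a minimal sketch that builds its own sample in a temporary file (the contents are illustrative, not taken from testfile1) and compares uniq alone with sort piped into uniq.

```shell
# Build a small sample with non-adjacent duplicates (contents are illustrative).
demo=$(mktemp)
printf '%s\n' 'test 30' 'Hello 95' 'test 30' 'Hello 95' 'Linux 85' > "$demo"

# uniq alone only compares neighbouring lines, so nothing is removed here.
plain=$(uniq "$demo")

# Sorting first makes equal lines adjacent, so uniq can collapse them.
sorted=$(sort "$demo" | uniq)

printf 'uniq only:\n%s\n\nsort | uniq:\n%s\n' "$plain" "$sorted"
rm -f "$demo"
```

The first result still has all five lines, while the second has three.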
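AWK is a language for processing text files and a powerful text-analysis tool. Its name comes from the family-name initials of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. awk reads a file line by line, splits each line into fields using whitespace (spaces and tabs) as the default separator, and then processes those fields.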
    [root@liruilong ~]# awk '{print $1,$2}' log.txt
    2 s
    3 Are
    This's a
    10 There
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]# awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
    2        a
    3        like
    This's
    10       orange,apple,mongo
    [root@liruilong ~]#
awk -F    # -F specifies the field separator; it is equivalent to setting the built-in variable FS
    # split on ","
    [root@liruilong ~]# awk -F, '{print $1,$2}' log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange apple
    # or use the built-in variable
    [root@liruilong ~]# awk 'BEGIN{FS=","} {print $1,$2}' log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange apple
    # use several separators: [ ,] splits on either a space or a comma
    [root@liruilong ~]# awk -F '[ ,]' '{print $1,$2,$5}' log.txt
    2 s test
    3 Are awk
    This's a
    10 There apple
    [root@liruilong ~]#
awk -v    # set a variable
    [root@liruilong ~]# awk -va=1 '{print $1,$1+a}' log.txt
    2 3
    3 4
    This's 1
    10 11
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]#
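A common use of -v is to hand a shell value to awk without splicing it into the program text. The following is a minimal sketch; the names threshold and min are illustrative assumptions, and the input only imitates log.txt.

```shell
# Hypothetical shell variable holding the cut-off value.
threshold=2

# -v copies the shell value into the awk variable min before any input is read.
above=$(printf '%s\n' '2 s is a test' '3 Are you like awk' '10 There are' |
awk -v min="$threshold" '$1 > min { print $1 }')

printf '%s\n' "$above"
```

Only the first fields 3 and 10 are printed, because 2 is not greater than the threshold.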
awk -f {awk script} {file name}
$ awk -f cal.awk log.txt
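The contents of cal.awk are not shown here, so the sketch below uses a hypothetical stand-in script, written to a temporary file, to show how -f loads a program from a file. The script body and the sample input (modelled on log.txt) are assumptions for illustration only.

```shell
# Hypothetical script file; its name and contents are illustrative only.
script=$(mktemp)
cat > "$script" <<'EOF'
# Runs once per input line: print the line number and the field count.
{ print NR ": " NF " fields" }
EOF

# Sample input modelled on log.txt from this article.
result=$(printf '%s\n' '2 s is a test' '3 Are you like awk' | awk -f "$script")

printf '%s\n' "$result"
rm -f "$script"
```

Keeping the program in a file avoids shell quoting problems and lets the script carry its own comments.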
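Operator                   Description
= += -= *= /= %= ^= **=    Assignment
?:                         C-style conditional expression
||                         Logical OR
&&                         Logical AND
~ and !~                   Matches / does not match a regular expression
< <= > >= != ==            Relational operators
space                      Concatenation
+ -                        Addition, subtraction
* / %                      Multiplication, division, remainder
+ - !                      Unary plus, unary minus, logical NOT
^ **                       Exponentiation
++ --                      Increment or decrement, as prefix or postfix
$                          Field reference
in                         Array membership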
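A few of these operators can be seen working together in a short sketch. The input imitates two lines of log.txt, and the variable names size, label, seen, and known are illustrative assumptions.

```shell
# Sample input modelled on log.txt from this article.
ops=$(printf '%s\n' '2 s is a test' '10 There are orange,apple,mongo' |
awk '{
    # ?: picks a label from a numeric comparison on the first field.
    size = ($1 > 5) ? "big" : "small"
    # Placing values side by side concatenates them.
    label = $2 "-" size
    # Record the second field as an array index.
    seen[$2] = 1
    # in tests array membership without creating the element.
    known = ("There" in seen) ? "yes" : "no"
    # ^ is exponentiation.
    print label, $1 ^ 2, known
}')

printf '%s\n' "$ops"
```

The first line prints no in the last column because "There" is only added to the array when the second line is processed.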
Select lines whose first column is greater than 2. (The line This's a test also matches, because a non-numeric first field is compared as a string.)
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]# awk '$1>2' log.txt
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
Select lines whose first column equals 2
    [root@liruilong ~]# awk '$1==2 {print $1,$3}' log.txt
    2 is
Select lines whose first column is greater than 2 and whose second column equals 'Are'
    [root@liruilong ~]# awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt
    3 Are you
    [root@liruilong ~]#
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]# awk '$2 ~ /Th/' log.txt
    10 There are orange,apple,mongo
    [root@liruilong ~]# awk '$2 ~ /Th/ {print $2,$4}' log.txt
    There orange,apple,mongo
    [root@liruilong ~]#
~ is the match operator; the pattern goes between the slashes //. A bare /pattern/ is matched against the whole line:
    [root@liruilong ~]# awk '/yo/ ' log.txt
    3 Are you like awk
    [root@liruilong ~]#
Ignore case (IGNORECASE is a gawk extension)
    [root@liruilong ~]# awk 'BEGIN{IGNORECASE=1} /this/' log.txt
    This's a test
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]#
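Where gawk is not available, one portable alternative (a substitute technique, not the IGNORECASE approach used above) is to lower-case the line with the standard tolower() function before matching. A minimal sketch on input imitating log.txt:

```shell
# Lower-case each line before matching, so the pattern itself stays lower-case.
matches=$(printf '%s\n' '2 s is a test' "This's a test" '10 There are orange' |
awk 'tolower($0) ~ /this/')

printf '%s\n' "$matches"
```

The original line is printed unchanged, since tolower() returns a copy and does not modify $0.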
Negate a pattern
    [root@liruilong ~]# awk '$2 !~ /th/ {print $2,$4}' log.txt
    s a
    Are like
    a
    There orange,apple,mongo
    [root@liruilong ~]# cat log.txt
    2 s is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
    [root@liruilong ~]# awk '!/th/ {print$2,$4}' log.txt
    s a
    Are like
    a
    There orange,apple,mongo
    [root@liruilong ~]#
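awk scripts
When writing awk scripts, two keywords deserve attention: BEGIN and END.
BEGIN { statements that run before any input is read }
END { statements that run after all lines have been processed }
{ statements that run for each line }
Suppose we have the following file (a table of student scores):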
    [root@liruilong ~]# cat score.txt
    Marry 2143 78 84 77
    Jack 2321 66 78 45
    Tom 2122 48 77 71
    Mike 2537 87 97 95
    Bob 2415 40 57 62
    [root@liruilong ~]#
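The three kinds of block can be combined on this file. The sketch below assumes that columns 3 to 5 are the three scores for each student (the source does not label the columns) and prints a per-student total followed by the average total; the variable names total and sum are illustrative.

```shell
# Recreate score.txt from the listing above in a temporary file.
scores=$(mktemp)
printf '%s\n' 'Marry 2143 78 84 77' 'Jack 2321 66 78 45' 'Tom 2122 48 77 71' \
    'Mike 2537 87 97 95' 'Bob 2415 40 57 62' > "$scores"

report=$(awk '
BEGIN { print "NAME TOTAL" }            # runs once, before any input
{ total = $3 + $4 + $5                  # runs for every line
  sum += total
  print $1, total }
END { print "AVERAGE", sum / NR }       # runs once, after the last line
' "$scores")

printf '%s\n' "$report"
rm -f "$scores"
```

In the END block NR still holds the number of lines read, which is why it can serve as the divisor for the average.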
    [root@liruilong ~]# sed -n '10p' file.txt
    Line 10
    [root@liruilong ~]# awk 'NR==10' file.txt
    Line 10
    [root@liruilong ~]# cat file.txt
    Line 1
    Line 2
    Line 3
    Line 4
    Line 5
    Line 6
    Line 7
    Line 8
    Line 9
    Line 10
    [root@liruilong ~]#
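Both commands above keep reading to the end of the file after printing line 10. On large files it can help to stop as soon as the line is found; the following is a minimal sketch of that variation, using a generated ten-line file in place of file.txt.

```shell
# Generate the same ten-line sample as file.txt.
lines=$(mktemp)
i=1
while [ "$i" -le 10 ]; do echo "Line $i"; i=$((i + 1)); done > "$lines"

# Print line 10, then exit so awk does not read the rest of the file.
tenth=$(awk 'NR == 10 { print; exit }' "$lines")

# sed equivalent: print line 10, then quit.
tenth_sed=$(sed -n '10{p;q;}' "$lines")

printf '%s\n%s\n' "$tenth" "$tenth_sed"
rm -f "$lines"
```

If the file has fewer than ten lines, both commands simply print nothing.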